Convolutional neural network processor, image processing method and electronic device

ABSTRACT

The present disclosure discloses a convolutional neural network processor, an image processing method and an electronic device. The method includes: receiving, by the first convolutional unit, the input image to be processed, extracting the N feature maps with different scales in the image to be processed, sending the N feature maps to the second convolutional unit, and sending the first feature map to the processing unit; fusing, by the processing unit, the received preset noise information and the first feature map, to obtain the second feature map, and sending the second feature map to the second convolutional unit; and fusing, by the second convolutional unit, the received N feature maps with the second feature map to obtain the processed image.

The present application claims priority to Chinese Patent Application No. 201910940055.3, filed with the Chinese Patent Office on Sep. 30, 2019, and entitled "CONVOLUTIONAL NEURAL NETWORK PROCESSOR, IMAGE PROCESSING METHOD AND ELECTRONIC DEVICE", which is hereby incorporated by reference in its entirety.

FIELD

The present application relates to the technical field of image processing, and in particular to a convolutional neural network processor, an image processing method and an electronic device.

BACKGROUND

With the rapid development of image processing technologies, requirements for image quality are becoming higher and higher. Image enhancement technology is widely used in processing over-exposed or under-exposed images for its advantages of improving image contrast and viewing effect.

At present, in order to improve the efficiency of image processing, convolutional neural networks are often used to process a low-quality image into a high-quality image. However, in the prior art, when a convolutional neural network is used for image processing, a densely connected architecture processes the image as a whole, so both luminance information and chrominance information have to be processed. This makes the workload of image processing using convolutional neural networks large and further reduces the efficiency of image processing.

SUMMARY

The present disclosure provides a convolutional neural network processor, an image processing method and an electronic device. In a first aspect, the embodiments of the present disclosure provide a convolutional neural network processor, including a first convolutional unit, a processing unit and a second convolutional unit; where

the first convolutional unit includes N first convolutional layers connected in sequence, and is configured to extract N feature maps with different scales in an image to be processed, where N is a positive integer and each first convolutional layer is configured to extract one feature map;

the processing unit is connected with the first convolutional unit and the second convolutional unit, and is configured to fuse at least one piece of received preset noise information and a first feature map with a smallest scale in the N feature maps with different scales extracted by the first convolutional unit, to obtain a fused second feature map; and

the second convolutional unit includes N second convolutional layers connected in sequence, and is configured to fuse the N feature maps extracted by the first convolutional unit with the second feature map to obtain a processed image.

Optionally, the convolutional neural network processor further includes 2N sampling units, where the first N sampling units are scrambling units, an output end of each first convolutional layer is provided with a scrambling unit configured to down-sample a feature image output by that first convolutional layer, and the output of each scrambling unit serves as the input of the next first convolutional layer; and the last N sampling units are merging units, and an output end of each second convolutional layer is provided with a merging unit configured to up-sample a feature image output by that second convolutional layer.

Optionally, the convolutional neural network processor further includes N interlayer connections configured to directly input the output of each first convolutional layer into the corresponding second convolutional layer, where the first convolutional layers are in one-to-one correspondence with the second convolutional layers.

Optionally, the processing unit includes a plurality of convolutional blocks connected in sequence, and the output of each convolutional block is input to all subsequent convolutional blocks, where each convolutional block includes a third convolutional layer and a fourth convolutional layer.

Optionally, the convolutional neural network processor further includes N+1 fifth convolutional layers, where the fifth convolutional layers are disposed at the input ends of the processing unit and of each second convolutional layer, for performing superposition processing on a plurality of input data.

Optionally, each first convolutional layer, each second convolutional layer, each third convolutional layer, each fourth convolutional layer, and each fifth convolutional layer includes a 1×1 convolutional kernel.

Optionally, the noise information includes first noise information and second noise information, where a difference between an average value of all elements in the first noise information and a maximum luminance grayscale is smaller than a first preset threshold, and a difference between an average value of all elements in the second noise information and a minimum luminance grayscale is smaller than a second preset threshold.

In a second aspect, the embodiments of the present disclosure provide an image processing method, applied to the convolutional neural network processor described in the first aspect, and the method includes:

receiving, by the first convolutional unit, the input image to be processed, extracting the N feature maps with different scales in the image to be processed, sending the N feature maps to the second convolutional unit, and sending the first feature map to the processing unit, where N is the positive integer, and the first feature map is the feature map with the smallest scale in the N feature maps with different scales;

fusing, by the processing unit, the received preset noise information and the first feature map to obtain the second feature map, and sending the second feature map to the second convolutional unit; and

fusing, by the second convolutional unit, the received N feature maps with the second feature map to obtain the processed image.

Optionally, the extracting the N feature maps with different scales in the image to be processed includes:

acquiring, by each first convolutional layer in the first convolutional unit, a preset first convolutional weight matrix; and

performing, by each first convolutional layer, a convolutional operation on a feature map output by the previous adjacent convolutional layer and the first convolutional weight matrix corresponding to that first convolutional layer, to obtain the N feature maps with the different scales.

Optionally, the convolutional operation, performed by each first convolutional layer, on the feature map output by the previous adjacent convolutional layer and the first convolutional weight matrix corresponding to that first convolutional layer includes:

if any first convolutional layer is the first one of the convolutional layers in the first convolutional unit, performing, by that first convolutional layer, a convolutional operation on the image to be processed and the first convolutional weight matrix corresponding to that first convolutional layer to obtain a feature map; or

if any first convolutional layer is not the first one of the convolutional layers in the first convolutional unit, performing, by that first convolutional layer, a convolutional operation on the feature map output by the previous adjacent first convolutional layer and the first convolutional weight matrix corresponding to that first convolutional layer to obtain a feature map.

Optionally, after the convolutional operation on the image to be processed and the first convolutional weight matrix corresponding to that first convolutional layer is performed, or after the convolutional operation on the feature map output by the previous adjacent first convolutional layer and the first convolutional weight matrix corresponding to that first convolutional layer is performed, the method further includes:

down-sampling, by a scrambling unit, the obtained feature map.

Optionally, the fusing, by the second convolutional unit, the received N feature maps with the second feature map to obtain the processed image includes:

acquiring, by each second convolutional layer in the second convolutional unit, a preset second convolutional weight matrix; and

performing, by each second convolutional layer, a convolutional operation on a feature map output by the corresponding first convolutional layer and a feature map output by the previous adjacent second convolutional layer, to obtain the processed image.

Optionally, the convolutional operation, performed by each second convolutional layer, on the feature map output by the corresponding first convolutional layer and the feature map output by the previous adjacent second convolutional layer to obtain the processed image includes:

if any second convolutional layer is the first one of the convolutional layers in the second convolutional unit, performing a convolutional operation on a third feature map and the second convolutional weight matrix corresponding to that second convolutional layer to obtain a feature map, where the third feature map is obtained by a fifth convolutional layer stacking the received first feature map and the second feature map; or

if any second convolutional layer is not the first one of the convolutional layers in the second convolutional unit, stacking the feature map output by the previous adjacent second convolutional layer with the feature map output by the corresponding first convolutional layer to obtain a fourth feature map, and performing a convolutional operation on the fourth feature map and the second convolutional weight matrix corresponding to that second convolutional layer to obtain a feature map.

Optionally, after the convolutional operation on the third feature map and any second convolutional weight matrix to obtain the feature map is performed, or after the convolutional operation on the fourth feature map and the second convolutional weight matrix corresponding to that second convolutional layer to obtain the feature map is performed, the method further includes:

up-sampling, by a merging unit, the feature map.

Optionally, the fusing, by the processing unit, the received preset noise information and the first feature map to obtain the second feature map includes:

receiving, by a fifth convolutional layer between the processing unit and the first convolutional unit, the input noise information and the first feature map, and stacking the first feature map and the noise information to obtain a fifth feature map;

acquiring, by each convolutional block in the processing unit, a preset third convolutional weight matrix; and

performing, by each convolutional block, a convolutional operation on the feature maps output by all previous convolutional blocks and the third convolutional weight matrix corresponding to that convolutional block, to obtain the second feature map.

Optionally, the dimensions of each first convolutional weight matrix, each second convolutional weight matrix, and each third convolutional weight matrix are all 1×1.

In a third aspect, the embodiments of the present disclosure provide an electronic device, and the electronic device includes:

a memory for storing instructions executed by at least one processor;and

the processor for acquiring and executing the instructions stored in the memory to implement the method described in the second aspect.

In a fourth aspect, the embodiments of the present disclosure provide a computer-readable storage medium, and the computer-readable storage medium stores computer instructions that cause a computer to perform the method described in the second aspect when executed on the computer.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram of the working principle of a scrambling unit provided by the embodiments of the present disclosure.

FIG. 2 is a schematic diagram of the working principle of a merging unit provided by the embodiments of the present disclosure.

FIG. 3 is a structural schematic diagram of a convolutional neural network provided by the embodiments of the present disclosure.

FIG. 4 is a structural schematic diagram of a processing unit provided by the embodiments of the present disclosure.

FIG. 5 is a training flowchart of a convolutional neural network provided by the embodiments of the present disclosure.

FIG. 6 is a structural schematic diagram of a convolutional neural network generator provided by the embodiments of the present disclosure.

FIG. 7 is a structural schematic diagram of an analysis network provided by the embodiments of the present disclosure.

FIG. 8 is a structural schematic diagram of a discriminator provided by the embodiments of the present disclosure.

FIG. 9 is a data flow diagram of discriminator training provided by the embodiments of the present disclosure.

FIG. 10 is a flowchart of an image processing method provided by the embodiments of the present disclosure.

FIG. 11 is a principle diagram of a convolutional operation provided by the embodiments of the present disclosure.

FIG. 12 is a principle diagram of convolutional layer data superposition provided by the embodiments of the present disclosure.

FIG. 13 is a structural schematic diagram of an electronic device provided by the embodiments of the present disclosure.

DETAILED DESCRIPTION OF THE EMBODIMENTS

The embodiments described below are only part of the embodiments of the present disclosure, not all of them. Based on the embodiments in the present disclosure, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the scope of protection of the present disclosure.

In order to better understand the above technical solution, the technical solution of the present disclosure is described in detail below through the drawings and specific embodiments. It should be understood that the embodiments of the present disclosure and the specific features in the embodiments are a detailed description of the technical solution of the present disclosure, rather than a limitation on it. Without conflict, the embodiments of the present disclosure and the technical features in the embodiments can be combined with each other.

The technical terms mentioned in the embodiments of the present disclosure are explained below.

A convolutional neural network (CNN) is a kind of feedforward neural network that involves convolutional calculation and has a deep structure. It is one of the representative algorithms of deep learning.

A convolutional kernel is a two-dimensional data matrix in which each point has a certain value; it is used to extract features of an input image or to add features to the image.

A convolutional layer includes one or more convolutional kernels for performing a convolutional operation on the input image to obtain an output image.

A pooling layer is a kind of down-sampling used to reduce the size of the extracted image features. Commonly used pooling layers include max-pooling, avg-pooling, decimation, demuxout, etc.

A flatten layer is configured to convert multidimensional data into one-dimensional data, and is commonly used for the transition between a convolutional layer and a fully connected layer. The formula for the flatten layer is as follows:

$v_{k} = f_{\lfloor k/j \rfloor,\, k \bmod j}$

where $v_{k}$ is the k-th element of the output vector, and $f_{\lfloor k/j \rfloor,\, k \bmod j}$ is the element in row $\lfloor k/j \rfloor$ and column $k \bmod j$ of the input matrix f, which has j columns.
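As an illustrative check of the index relation above, the following Python sketch (using NumPy; the matrix shape is an assumption chosen for the example) verifies that row-major flattening satisfies the formula:

    import numpy as np

    # A matrix f with j = 4 columns; flattening it row-major gives a
    # vector v whose k-th element is f at row k // j, column k % j.
    f = np.arange(12).reshape(3, 4)
    j = f.shape[1]
    v = f.flatten()

    for k in range(v.size):
        assert v[k] == f[k // j, k % j]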

A homogenization layer uses the mean value of the feature images to represent an image, converting multidimensional feature data into one scalar datum.

A fully connected layer (FCN) has the same structure as the convolutional neural network, but uses scalar values instead of convolutional kernels.

A softmax layer is a logic function generator that compresses the value of each element of a K-dimensional vector so that each element falls within (0, 1) and all the elements of the resulting vector sum to 1:

$\sigma(z)_{j} = \frac{e^{z_{j}}}{\sum\limits_{k = 1}^{K} e^{z_{k}}}, \quad j = 1, 2, \ldots, K$

where $\sigma(z)$ is the compressed K-dimensional vector, $e^{z_{j}}$ is the exponential of the j-th element of the input vector, and $\sum_{k = 1}^{K} e^{z_{k}}$ is the sum over all the elements of the vector.
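For reference, a minimal softmax sketch in Python/NumPy follows; the max-subtraction is a standard numerical-stability step not mentioned in the text above:

    import numpy as np

    def softmax(z):
        # Compress each element of the K-dimensional vector z into (0, 1)
        # so that the results sum to 1; subtracting the max avoids overflow.
        e = np.exp(z - z.max())
        return e / e.sum()

    s = softmax(np.array([1.0, 2.0, 3.0]))
    assert np.isclose(s.sum(), 1.0)    # elements sum to 1
    assert np.all((s > 0) & (s < 1))   # each element lies in (0, 1)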

An instance normalization unit is configured to normalize a feature map output by a convolutional layer. Specifically, the formula for instance normalization is as follows:

$y_{tijk} = \frac{x_{tijk} - \mu_{ti}}{\sqrt{\sigma_{ti}^{2} + \varepsilon}}$

$\mu_{ti} = \frac{1}{HW}\sum\limits_{l = 1}^{W}\sum\limits_{k = 1}^{H} x_{tilk}$

$\sigma_{ti}^{2} = \frac{1}{HW}\sum\limits_{l = 1}^{W}\sum\limits_{k = 1}^{H} \left( x_{tilk} - m\,\mu_{ti} \right)^{2}$

where $x_{tijk}$ is the value in the t-th block, the i-th feature map, the j-th column and the k-th row of the feature map set output by any convolutional layer; H is the number of rows and W the number of columns of the matrix of each feature map; $\mu_{ti}$ is the mean value of the elements of the i-th feature map of the t-th block; m is a preset coefficient; $\varepsilon$ is a small constant that keeps the denominator nonzero; and $\sigma_{ti}^{2}$ is the mean square deviation of the values of the i-th feature map of the t-th block.
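The formulas translate directly into the following NumPy sketch; taking the preset coefficient m as 1 and ε as 1e-5 are assumptions made for illustration:

    import numpy as np

    def instance_norm(x, m=1.0, eps=1e-5):
        # x: feature map set of shape (T, I, H, W) — T blocks, I feature maps.
        # Each (t, i) feature map is normalized by its own mean and variance.
        mu = x.mean(axis=(2, 3), keepdims=True)                     # mu_ti
        var = ((x - m * mu) ** 2).mean(axis=(2, 3), keepdims=True)  # sigma_ti^2
        return (x - mu) / np.sqrt(var + eps)

    y = instance_norm(np.random.randn(2, 3, 8, 8))
    print(y.shape)  # (2, 3, 8, 8)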

A scrambling unit (Demux) is configured to rearrange the pixels of the input image and divide the rearranged image into m images, where m is a positive integer not less than 2. It should be understood that, since the size of each output image of the scrambling unit is smaller than the size of the input image, the scrambling unit is essentially a kind of down-sampling which reduces the size of the output image. However, the scrambling unit only rearranges and segments the pixels of the input image instead of discarding them, so processing by the scrambling unit preserves the integrity of the information of the input image.

Specifically, the scrambling unit can move the pixels in the input image according to a preset rearrangement rule or a preset scrambling template, and then segment the rearranged image into the plurality of output images. For example, as shown in FIG. 1, the input image is rearranged according to the preset scrambling template: the pixels corresponding to all elements a in the input image are arranged together, the pixels corresponding to all elements b are arranged together, the pixels corresponding to all elements c are arranged together, and the pixels corresponding to all elements d are arranged together, to obtain the rearranged image. The rearranged image is then decomposed into four sub-images: the first sub-image includes the pixels corresponding to all elements a, the second sub-image those corresponding to all elements b, the third sub-image those corresponding to all elements c, and the fourth sub-image those corresponding to all elements d.
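A minimal Python/NumPy sketch of a 2×2 scrambling unit follows, assuming the common interleaved template in which the a/b/c/d positions of FIG. 1 are the four 2×2 phase offsets (the actual template is implementation-defined):

    import numpy as np

    def demux(x):
        # Rearrange and split an (H, W) image into four (H/2, W/2) sub-images;
        # no pixel is discarded, so the input can be reconstructed exactly.
        a = x[0::2, 0::2]  # even rows, even columns ("a" positions)
        b = x[0::2, 1::2]  # even rows, odd columns ("b" positions)
        c = x[1::2, 0::2]  # odd rows, even columns ("c" positions)
        d = x[1::2, 1::2]  # odd rows, odd columns ("d" positions)
        return a, b, c, d

    img = np.arange(16).reshape(4, 4)
    subs = demux(img)
    print([s.shape for s in subs])  # [(2, 2), (2, 2), (2, 2), (2, 2)]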

A merging unit (MUX) is configured to merge the m images into one image and perform pixel rearrangement on the merged image according to the preset rule or template; the merging unit is thus essentially the inverse operation of the scrambling unit.

For example, still taking the first, second, third and fourth sub-images shown in FIG. 1 as an example: referring to FIG. 2, the four sub-images are input into the merging unit, which merges them to obtain a merged image, where the merged image in the merging unit is equivalent to the rearranged image in the scrambling unit. Then the merged image is rearranged in a manner inverse to that of FIG. 1 to obtain an image with the same size as the input image of the scrambling unit in FIG. 1.
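Under the same assumed template, the merging unit is the exact inverse; the round-trip check below (reusing the demux function from the previous sketch) confirms that demux followed by mux restores the original image:

    import numpy as np

    def mux(a, b, c, d):
        # Merge four (H/2, W/2) sub-images back into one (H, W) image by
        # inverting the rearrangement performed by the scrambling unit.
        h, w = a.shape
        y = np.empty((2 * h, 2 * w), dtype=a.dtype)
        y[0::2, 0::2] = a
        y[0::2, 1::2] = b
        y[1::2, 0::2] = c
        y[1::2, 1::2] = d
        return y

    img = np.arange(16).reshape(4, 4)
    assert np.array_equal(mux(*demux(img)), img)  # lossless round trip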

In the solution provided by the embodiments of the present disclosure, image enhancement is realized through the convolutional neural network processor: an original low-quality image and preset noise are input into the convolutional neural network processor, and a high-quality image is obtained through processing by the processor. To facilitate understanding of the image enhancement process below, the structure of the convolutional neural network used in the embodiments of the present disclosure is described first.

Referring to FIG. 3, the embodiments of the present disclosure provide a convolutional neural network, including a first convolutional unit 1, a processing unit 2 and a second convolutional unit 3; where

the first convolutional unit 1 includes N first convolutional layers 11 connected in sequence, and is configured to extract N feature maps with different scales in an image to be processed, where N is a positive integer, each first convolutional layer 11 is configured to extract one feature map, and the scales of the feature maps extracted by different first convolutional layers 11 are different; the image to be processed and the feature maps are all presented in matrix form, and the feature maps represent image feature information, such as luminance information of the image to be processed; for example, the scales may represent the resolution of the image;

the processing unit 2 is connected with the first convolutional unit 1 and the second convolutional unit 3, and is configured to fuse the received preset noise information and a first feature map with the smallest scale in the N feature maps with different scales extracted by the first convolutional unit 1, to obtain a fused second feature map, where the noise information includes preset luminance information and other information; and

the second convolutional unit 3 includes N second convolutional layers 31 connected in sequence, and is configured to fuse the N feature maps extracted by the first convolutional unit 1 with the second feature map to obtain a processed image.

In the embodiments of the present disclosure, referring to FIG. 3, the first convolutional unit 1, the processing unit 2 and the second convolutional unit 3 are connected in sequence in the convolutional neural network; the image input to the first convolutional unit 1 is the original image to be processed, and the image output by the second convolutional unit 3 is the processed image. Generally, the output of each unit in a convolutional neural network is the input of the next unit; however, in the solution provided by the embodiments of the present disclosure, the input of the processing unit 2 also includes the input preset noise information, in addition to the output of the first convolutional unit 1, where the noise information may include randomly generated luminance information or predetermined luminance information; for example, the noise information is a Gaussian signal.

It should be understood that, in the embodiments of the present disclosure, in order to allow the original image input to the convolutional neural network to be operated on by each layer of the network, the image to be processed, the feature maps and the noise information provided by the embodiments of the present disclosure are all presented in matrix form.

In one implementation mode, the convolutional neural network further includes 2N sampling units 4, where the first N sampling units 4 are scrambling units arranged at the output end of each first convolutional layer 11 and configured to down-sample the feature image output by that first convolutional layer 11, the output of each scrambling unit serving as the input of the next first convolutional layer 11; and the last N sampling units 4 are merging units arranged at the output end of each second convolutional layer 31 and configured to up-sample the feature image output by that second convolutional layer 31.

For example, in order to reduce the calculation amount of the convolutional neural network for image processing, referring to FIG. 3, a scrambling unit, which is equivalent to a down-sampling unit, is arranged at the output end of each first convolutional layer 11 in the convolutional neural network for down-sampling the feature map output by that first convolutional layer 11, and the down-sampled feature map is used as the input of the next first convolutional layer 11, so that the scales of the feature maps output by different first convolutional layers 11 are different.

Further, in order to make the size of the image output by the convolutional neural network the same as that of the image to be processed, a merging unit, which is equivalent to an up-sampling unit, is arranged at the output end of each second convolutional layer 31 for up-sampling the feature map output by that second convolutional layer 31, so that the size of the image output by the last second convolutional layer 31 is the same as that of the image to be processed.

In one implementation mode, the convolutional neural network further includes N interlayer connections for directly inputting the output of each first convolutional layer 11 into the corresponding second convolutional layer 31, where the first convolutional layers 11 are in one-to-one correspondence with the second convolutional layers 31.

Referring to FIG. 3, in the convolutional neural network provided by the embodiments of the present disclosure, the first convolutional layers 11 are in one-to-one correspondence with the second convolutional layers 31, and the first convolutional layers 11 directly input the extracted feature maps to the corresponding second convolutional layers 31.

In one implementation mode, referring to FIG. 4, the processing unit 2 includes a plurality of convolutional blocks 21 connected in sequence, and the output of each convolutional block 21 is input to all subsequent convolutional blocks 21, where each convolutional block 21 includes a third convolutional layer 211 and a fourth convolutional layer 212.

For example, the processing unit 2 is a densely connected convolutional network (DenseBlock). Each DenseBlock includes the plurality of convolutional blocks 21 connected in sequence, where the output of each convolutional block 21 is input not only to the next convolutional block 21 but also, at the same time, to all the convolutional blocks 21 after it, for example through a concat function; correspondingly, the input of each convolutional block 21 is the output of all the convolutional blocks before it.

In the solution provided by the embodiments of the present disclosure, each convolutional block 21 may have a "B+C" structure, where B refers to a bottleneck layer, i.e., the third convolutional layer 211, which reduces the dimensionality of the data so as to reduce the number of parameters in the subsequent convolutional operation; and C refers to a convolutional layer, i.e., the fourth convolutional layer 212, which performs convolutional averaging on the data.
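A hedged PyTorch sketch of this processing unit follows. The "B+C" block and the dense connectivity match the description above; the channel counts, growth rate, block count and ReLU activation are assumptions, and 1×1 kernels are used per the 1×1-kernel implementation mode mentioned in this disclosure:

    import torch
    import torch.nn as nn

    class ConvBlock(nn.Module):
        # The "B + C" structure: bottleneck (third) layer + conv (fourth) layer.
        def __init__(self, in_ch, growth, bottleneck=32):
            super().__init__()
            self.b = nn.Conv2d(in_ch, bottleneck, kernel_size=1)  # reduces dimensionality
            self.c = nn.Conv2d(bottleneck, growth, kernel_size=1)
        def forward(self, x):
            return self.c(torch.relu(self.b(x)))

    class DenseBlock(nn.Module):
        # Each block receives the concatenated outputs of all previous blocks.
        def __init__(self, in_ch, growth=16, n_blocks=4):
            super().__init__()
            self.blocks = nn.ModuleList(
                ConvBlock(in_ch + i * growth, growth) for i in range(n_blocks))
        def forward(self, x):
            feats = [x]
            for blk in self.blocks:
                feats.append(blk(torch.cat(feats, dim=1)))  # dense ("concat") links
            return torch.cat(feats, dim=1)

    out = DenseBlock(in_ch=64)(torch.randn(1, 64, 16, 16))
    print(out.shape)  # torch.Size([1, 128, 16, 16]): 64 + 4*16 channels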

In one implementation mode, referring to FIG. 3, the convolutional neural network further includes N+1 fifth convolutional layers 5, where the fifth convolutional layers 5 are disposed at the input ends of the processing unit 2 and of each second convolutional layer 31, and are configured to perform superposition processing on a plurality of input data.

For example, the input of the processing unit 2 includes not only the output of the last first convolutional layer 11 in the first convolutional unit 1, but also the input preset noise information. The input of each second convolutional layer 31 includes not only the output of the previous second convolutional layer 31, but also the output of the corresponding first convolutional layer 11. Therefore, in the solution provided in the embodiments of the present disclosure, both the processing unit 2 and the second convolutional layers 31 have a plurality of input data, and both perform a convolutional operation on the input data and the convolutional weight of each convolutional kernel. In order that the output data of the processing unit 2 or of a second convolutional layer 31 contains the information of all its input data, a fifth convolutional layer 5 is arranged at the input end of the processing unit 2 and at each input end of the second convolutional layers 31, and the plurality of input data are superimposed so that the processing unit 2 or the second convolutional layer 31 performs the convolutional operation on the superimposed data.
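A minimal sketch of this superposition, assuming channel-wise concatenation as the stacking operation and illustrative channel counts; the fifth convolutional layer is modeled as a 1×1 convolution that fuses the concatenated inputs:

    import torch
    import torch.nn as nn

    feat = torch.randn(1, 64, 16, 16)   # first feature map (smallest scale)
    noise = torch.randn(1, 64, 16, 16)  # preset noise information, same shape

    # Fifth convolutional layer: superimpose the inputs along the channel
    # dimension, then fuse them with a 1x1 convolution so the output
    # carries information from all inputs.
    fifth = nn.Conv2d(in_channels=128, out_channels=64, kernel_size=1)
    fused = fifth(torch.cat([feat, noise], dim=1))
    print(fused.shape)  # torch.Size([1, 64, 16, 16])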

In one implementation mode, the noise information input to the convolutional neural network includes first noise information and second noise information, where the difference between the average value of all elements in the first noise information and the maximum luminance grayscale is smaller than a first preset threshold, and the difference between the average value of all elements in the second noise information and the minimum luminance grayscale is smaller than a second preset threshold. For example, both the first noise information and the second noise information are Gaussian noise, and the average value of all elements in the first or second noise information is the mean of the Gaussian noise.

For example, the first noise information and the second noise information are both presented in matrix form and both include luminance information, where the values of all elements in the first noise information are not larger than the maximum luminance grayscale (255), and the difference between their average value and the maximum luminance grayscale is smaller than the first preset threshold; the values of all elements in the second noise information are not smaller than the minimum luminance grayscale (0), and the difference between their average value and the minimum luminance grayscale is smaller than the second preset threshold. For example, the value range of the first preset threshold is [0, 5], and the value range of the second preset threshold is [0, 5].
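As a hedged illustration, the following NumPy sketch draws the two noise matrices as Gaussian noise whose means sit near the maximum and minimum luminance grayscales; the mean values, standard deviation and matrix size are assumptions:

    import numpy as np

    h, w, sigma = 16, 16, 1.0
    # First noise: mean close to the maximum luminance grayscale (255).
    first_noise = np.clip(np.random.normal(254.0, sigma, (h, w)), 0, 255)
    # Second noise: mean close to the minimum luminance grayscale (0).
    second_noise = np.clip(np.random.normal(1.0, sigma, (h, w)), 0, 255)

    # The preset thresholds (here both 5) bound how far the averages may
    # sit from the maximum and minimum grayscales.
    assert abs(first_noise.mean() - 255) < 5
    assert abs(second_noise.mean() - 0) < 5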

In one implementation mode, in order to reduce the calculation amount of image processing in the convolutional neural network, each first convolutional layer 11, each second convolutional layer 31, each third convolutional layer 211, each fourth convolutional layer 212, and each fifth convolutional layer 5 includes a 1×1 convolutional kernel.

In the solution provided by the embodiments of the present disclosure, the convolutional neural network includes the first convolutional unit 1, the processing unit 2 and the second convolutional unit 3; the image to be processed is input into the first convolutional unit 1, and the N feature maps with different scales are extracted through the first convolutional unit 1. Then the first feature map with the smallest scale extracted by the first convolutional unit 1 is input into the processing unit 2, the first feature map and the noise information received by the processing unit 2 are fused to obtain the second feature map, and the second feature map is fused with the N feature maps with different scales in the second convolutional unit 3 to obtain the processed image. Therefore, in the solution provided by the embodiments of the present disclosure, on the one hand, the image feature information in the image to be processed is extracted through the first convolutional unit 1, so that processing of other parameter information of the image to be processed in the convolutional neural network is avoided, which reduces the workload of image processing and improves the efficiency of image processing; on the other hand, the extracted image feature information and the noise information are fused through the processing unit 2, so that the coverage of the image feature information is increased, which further increases image contrast and improves image quality.

Further, in the solution provided in the embodiments of the present disclosure, the image processing method is performed by using the above convolutional neural network, so the convolutional neural network needs to be established through training before the image processing. The training process of the convolutional neural network is described in detail below. Specifically, referring to FIG. 5, the training process of the convolutional neural network includes the following steps.

Step 501, a convolutional neural network model is established.

An initial convolutional neural network model is established according to the convolutional neural network structure described above.

Step 502, a pair of training images is selected from a preset sample set, where the sample set includes a plurality of pairs of training images, and each pair of training images includes a low-quality image and the corresponding high-quality image.

A plurality of pairs of image samples for convolutional neural network training are stored in a database of an electronic device in advance. A pair of image samples includes two images with the same content but different contrast ratios, where the image with the high contrast ratio in the pair is the high-quality image and the image with the low contrast ratio is the low-quality image.

Step 503, the low-quality image in a pair of sample images is input into the convolutional neural network to obtain a training result image.

Step 504, the training result image is compared with the high-quality sample image to obtain losses.

The losses refer to the difference between the training result image and the high-quality sample image, and include at least one of an L1 loss, a content loss, an adversarial loss or a color loss.

For example, referring to FIG. 6, which is a structural schematic diagram of a convolutional neural network generator provided by the embodiments of the present disclosure, the generator includes a convolutional neural network, an L1 unit, a Gaussian unit, an analysis network, a discriminator, a comprehensive loss calculation network, and an optimizer. The L1 unit is configured to calculate the L1 loss, the Gaussian unit is configured to calculate the color loss, the analysis network is configured to calculate the content loss, the discriminator is configured to calculate the adversarial loss, and the optimizer is configured to adjust the convolutional neural network according to the L1 loss, the content loss, the adversarial loss and the color loss.

The low-quality sample image in a pair of sample images is input into the convolutional neural network to obtain the output training result image, and the high-quality sample image in the pair, together with the training result image output by the convolutional neural network, is input into the L1 unit, the Gaussian unit, the analysis network and the discriminator to calculate the losses.

In order to facilitate understanding of the calculation process of the various losses, the structure and calculation process of each unit in the generator for calculating the losses are described in detail below.

1. The L1 unit calculates the L1 loss by the following formula:

L1=0.299*abs(R_(i)−R_(g))+0.587*abs(G_(i)−G_(g))+0.114*abs(B_(i)−B_(g))

where

R_(i), G_(i) and B_(i) are the red, green and blue components of the training result image respectively;

R_(g), G_(g) and B_(g) are the red, green and blue components of the high-quality sample image; and abs( ) denotes the absolute value operation.
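The formula translates into the following Python/NumPy sketch; reducing the per-pixel result to a scalar by averaging is an assumption, since the formula itself is elementwise:

    import numpy as np

    def l1_loss(result, reference):
        # result, reference: (H, W, 3) float arrays with R, G, B channels.
        # Luminance-weighted absolute differences, per the formula above.
        d = np.abs(result - reference)
        per_pixel = 0.299 * d[..., 0] + 0.587 * d[..., 1] + 0.114 * d[..., 2]
        return per_pixel.mean()  # reduce to a scalar (assumed reduction)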

2. The Gaussian unit calculates the color loss by the following formula:

L_(color)=abs(gaussian(I)−gaussian(G))

where L_(color) represents the color loss; gaussian( ) represents the Gaussian blur operation; I represents the training result image; and G represents the high-quality sample image.
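A minimal single-channel sketch follows; the separable Gaussian blur (with its sigma and radius) stands in for the gaussian( ) operation, and averaging to a scalar is again an assumed reduction:

    import numpy as np

    def gaussian_blur(img, sigma=3.0, radius=9):
        # Separable Gaussian blur: convolve rows, then columns, with a 1-D kernel.
        t = np.arange(-radius, radius + 1)
        k = np.exp(-t**2 / (2 * sigma**2))
        k /= k.sum()
        rows = np.apply_along_axis(np.convolve, 1, img, k, mode='same')
        return np.apply_along_axis(np.convolve, 0, rows, k, mode='same')

    def color_loss(result, reference):
        # L_color = abs(gaussian(I) - gaussian(G)), averaged to a scalar here.
        return np.abs(gaussian_blur(result) - gaussian_blur(reference)).mean()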

3. The content loss is calculated by the analysis network.

For example, referring to FIG. 7, the analysis network includes a plurality of convolutional layers, with a pooling layer between every two adjacent convolutional layers, where each convolutional layer is configured to extract feature maps from the training result image and the high-quality sample image, and each pooling layer is configured to down-sample the extracted feature maps and input the down-sampled feature maps to the next convolutional layer. After being input to the analysis network, the training result image and the high-quality sample image pass through the plurality of convolutional layers and pooling layers, so that the feature maps corresponding to the training result image and to the high-quality sample image are obtained.

Then, according to the feature maps corresponding to the training result image and to the high-quality sample image extracted by the analysis network, the content loss is calculated by the following formula:

$L_{content} = \frac{1}{2C_{1}}\sum\limits_{ij}\left( I_{ij}^{l} - G_{ij}^{l} \right)^{2}$

where $L_{content}$ represents the content loss; $C_{1}$ is a preset coefficient; $I_{ij}^{l}$ is the value at the j-th position of the feature map of the training result image output by the i-th convolutional kernel in the l-th convolutional layer of the analysis network; and $G_{ij}^{l}$ is the corresponding value for the high-quality sample image.
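Given feature maps already extracted by one convolutional layer of the analysis network, the content loss reduces to a few lines; the array shapes and C1 value below are assumptions:

    import numpy as np

    def content_loss(feat_result, feat_reference, c1=1.0):
        # feat_*: (n_kernels, H * W) feature maps of the training result
        # image and the high-quality sample image from the l-th layer.
        return ((feat_result - feat_reference) ** 2).sum() / (2 * c1)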

Further, referring to FIG. 7, the analysis network provided by the embodiments of the present disclosure further includes flatten layers, fully connected layers and softmax layers connected in sequence, where the flatten layers are configured to convert the output feature maps into vector form; the fully connected layers have the same structure as the convolutional neural network, but the convolutional kernel of each of their convolutional layers is a scalar value; and the softmax layers are used to compress the data output by the fully connected layers to obtain the probability that the output image belongs to each tag in the sample set. However, in the solution provided by the embodiments of the present disclosure, the analysis network only uses the feature maps to calculate the content loss.

4. The adversarial loss is calculated by the discriminator.

For example, referring to FIG. 8, which is a structural schematic diagram of a discriminator provided by the embodiments of the present disclosure, the discriminator includes a plurality of convolutional blocks, fully connected layers, excitation layers, and pooling layers between every two adjacent convolutional blocks, connected in sequence, where each convolutional block includes two convolutional layers; the excitation layers are configured to perform nonlinear mapping on the output of the convolutional layers and convert the output data into scalar data, and the specific excitation layer functions include the sigmoid function, the rectified linear unit (ReLU) function and the like.

Based on the above discriminator structure, the discriminator calculates the adversarial loss by the following formula:

L_(D)=−E_(x˜pdata(x))[log D(x)]−E_(z˜pz(z))[log(1−D(G(z)))]

where L_(D) represents the adversarial loss; D represents the discriminator; pdata represents the set of high-quality images in the sample set; x represents an item in the set pdata; pz represents the set of low-quality images in the sample set; z represents an item in the set pz; E_(x˜pdata(x)) denotes the expectation over x drawn from pdata; and E_(z˜pz(z)) denotes the expectation over z drawn from pz.
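A hedged PyTorch sketch of this discriminator loss, assuming the conventional GAN form given above, with the expectations approximated by batch means and a small eps for numerical safety:

    import torch

    def discriminator_loss(d_real, d_fake, eps=1e-8):
        # d_real: D(x) on high-quality images; d_fake: D(G(z)) on generated
        # images; both are batches of probabilities in (0, 1).
        return -(torch.log(d_real + eps).mean()
                 + torch.log(1 - d_fake + eps).mean())

    loss = discriminator_loss(torch.full((4,), 0.9), torch.full((4,), 0.1))
    print(float(loss))  # small when D scores real high and fake low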

In the solution provided by the embodiments of the present disclosure, the discriminator is a classification network for verifying whether the training result image is a high-quality image and outputting a verification result of 0 or 1, where an output of 0 indicates that the training result image is not a high-quality image and an output of 1 indicates that it is.

Further, the discriminator should be trained before the adversarial loss is calculated through it, and the training process of the discriminator is described in detail below.

Referring to FIG. 9, a data flow diagram of discriminator training provided by the embodiments of the present disclosure is shown. Specifically, after the training result image output by the convolutional neural network and the high-quality image in the sample set are input to the discriminator, the adversarial loss is calculated based on the above adversarial loss formula; the adversarial loss is then input to the optimizer, and the optimizer optimizes and adjusts parameters in the discriminator based on the adversarial loss, such as the number of convolutional blocks.

Further, in the solution provided by the embodiments of the present disclosure, after the L1 loss, the content loss, the adversarial loss and the color loss are calculated, all the losses are input to a comprehensive calculation network to calculate a comprehensive loss representing the total loss degree. There are many ways to calculate the comprehensive loss in the embodiments of the present disclosure, which are not limited here. For example, the comprehensive loss can be obtained by multiplying the L1 loss, the content loss, the adversarial loss and the color loss by corresponding weights respectively and then summing the products.
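A minimal sketch of this weighted-sum variant in Python; the weight values are illustrative assumptions, not values from the disclosure:

    def comprehensive_loss(l1, content, adversarial, color,
                           weights=(1.0, 0.1, 0.01, 0.5)):
        # Weighted sum of the four losses; the optimizer minimizes this total.
        w1, w2, w3, w4 = weights
        return w1 * l1 + w2 * content + w3 * adversarial + w4 * color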

Step 505, the convolutional neural network is adjusted according to the comprehensive loss.

The above losses represent the difference between the training result image and the high-quality sample image, so parameters in the convolutional neural network can be adjusted according to them, for example, the weight matrix of the convolutional kernel in each convolutional layer of the convolutional neural network, or the number of convolutional layers.

According to the embodiments of the present disclosure, after the convolutional neural network is trained based on the training method, the input image is processed based on the trained convolutional neural network to obtain the high-quality image. The image processing process performed by the convolutional neural network processor is described in detail below.

Referring to FIG. 10, the embodiments of the present disclosure provide an image processing method applied to the convolutional neural network shown in FIG. 3, and the method includes:

step 1001, a first convolutional unit receives an input image to be processed, extracts N feature maps with different scales in the image to be processed, sends the N feature maps to a second convolutional unit, and sends a first feature map to a processing unit, where N is a positive integer, the first feature map is the feature map with the smallest scale among the N feature maps, the image to be processed and the feature maps are all presented in matrix form, and the feature maps represent image feature information, such as luminance information of the image to be processed.

In the solution provided by the embodiments of the present disclosure, the first convolutional unit can extract the feature maps of the image to be processed in a variety of ways; the following description takes one implementation mode as an example.

Extracting, by the first convolutional unit, the N feature maps with different scales in the image to be processed includes: each first convolutional layer in the first convolutional unit acquires a preset first convolutional weight matrix; and each first convolutional layer performs a convolutional operation on the feature map output by the previous adjacent convolutional layer and the first convolutional weight matrix corresponding to that first convolutional layer, to obtain the N feature maps with different scales.

For example, in the training process of the convolutional neural network, the convolutional weight matrix corresponding to each convolutional layer can be optimized and stored in a database. When image processing through the convolutional neural network is performed, the first convolutional weight matrix corresponding to each first convolutional layer is obtained based on a preset relationship between each first convolutional layer and its corresponding convolutional weight matrix; and then each first convolutional layer performs a convolutional operation on its input data and the corresponding first weight matrix to obtain a feature map.

In the solution provided by the embodiments of the present disclosure, the convolutional operation, performed by each first convolutional layer, on the feature map output by the previous adjacent convolutional layer and the first convolutional weight matrix corresponding to that first convolutional layer includes the following two cases:

Case 1: if any first convolutional layer is the first one of the convolutional layers in the first convolutional unit, that first convolutional layer performs a convolutional operation on the image to be processed and the first convolutional weight matrix corresponding to it to obtain a feature map.

For example, if the image to be processed is a 4×4×2 matrix and the convolutional kernel of the first convolutional layer is a 3×3×2 matrix, the convolutional operation process of the image to be processed and the first weight matrix corresponding to the first convolutional layer in the first convolutional unit is as follows:

Referring to FIG. 11, in the 4×4×2 matrix of the image to be processed, the 3×3×2 image matrix in the dotted-line frame and the 3×3×2 convolutional kernel are selected for the convolutional operation to obtain a 2×2×2 feature map. Specifically, the calculation of the 8 element values in the feature map is as follows:

v1=p1*k1+p2*k2+p3*k3+p5*k4+p6*k5+p7*k6+p9*k7+p10*k8+p11*k9.

Specific calculation of pixel data v2 in row 1 and column 2 of the first 2×2×1 matrix in the output image is as follows:

v2=p2*k1+p3*k2+p4*k3+p6*k4+p7*k5+p8*k6+p10*k7+p11*k8+p12*k9.

By analogy, specific calculation of pixel data v3 in row 2 and column 1 of the first 2×2×1 matrix in the output image is as follows:

v3=p5*k1+p6*k2+p7*k3+p9*k4+p10*k5+p11*k6+p13*k7+p14*k8+p15*k9.

Specific calculation of pixel data v4 in row 2 and column 2 of the first 2×2×1 matrix in the output image is as follows:

v4=p6*k1+p7*k2+p8*k3+p10*k4+p11*k5+p12*k6+p14*k7+p15*k8+p16*k9.

Similarly, specific calculation of pixel data v1-1 and v1-2 in row 1 of the second 2×2×1 matrix in the output image is as follows:

v1−1=p1−1*k1+p1−2*k2+p1−3*k3+p1−5*k4+p1−6*k5+p1−7*k6+p1−9*k7+p1−10*k8+p1−11*k9;

v1−2=p1−2*k1+p1−3*k2+p1−4*k3+p1−6*k4+p1−7*k5+p1−8*k6+p1−10*k7+p1−11*k8+p1−12*k9.

By analogy, specific calculation of pixel data v1-3 and v1-4 in row 2 of the second 2×2×1 matrix in the output image is as follows:

v1−3=p1−5*k1+p1−6*k2+p1−7*k3+p1−9*k4+p1−10*k5+p1−11*k6+p1−13*k7+p1−14*k8+p1−15*k9;

v1−4=p1−6*k1+p1−7*k2+p1−8*k3+p1−10*k4+p1−11*k5+p1−12*k6+p1−14*k7+p1−15*k8+p1−16*k9.
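The worked example above can be checked mechanically. The NumPy sketch below implements the per-channel valid convolution (strictly, cross-correlation, as is conventional in CNNs) and reproduces v1 for a first channel filled with p1=1, ..., p16=16 and a kernel filled with k1=1, ..., k9=9 (illustrative values, not from the disclosure):

    import numpy as np

    def valid_conv2d(img, kernel):
        # Slide the kernel over the image with stride 1 and no padding.
        h, w = img.shape
        kh, kw = kernel.shape
        out = np.empty((h - kh + 1, w - kw + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                out[i, j] = (img[i:i + kh, j:j + kw] * kernel).sum()
        return out

    p = np.arange(1, 17, dtype=float).reshape(4, 4)  # p1..p16 (first channel)
    k = np.arange(1, 10, dtype=float).reshape(3, 3)  # k1..k9
    v = valid_conv2d(p, k)
    print(v.shape)  # (2, 2): the 2x2 output for this channel
    # v[0, 0] equals v1 = p1*k1 + p2*k2 + p3*k3 + p5*k4 + ... + p11*k9
    assert v[0, 0] == 1*1 + 2*2 + 3*3 + 5*4 + 6*5 + 7*6 + 9*7 + 10*8 + 11*9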

Case 2: if any first convolutional layer is not the first one of the convolutional layers in the first convolutional unit, that first convolutional layer performs a convolutional operation on the feature map output by the previous adjacent first convolutional layer and the first convolutional weight matrix corresponding to it to obtain a feature map.

Specifically, the process of performing the convolutional operation on the feature map output by the previous adjacent first convolutional layer and the first convolutional weight matrix corresponding to any first convolutional layer is similar to the process of performing the convolutional operation on the image to be processed and the first convolutional weight matrix, and is not repeated here.

Step 1002, the processing unit fuses the received preset noise information and the first feature map to obtain a second feature map, and the second feature map is sent to the second convolutional unit, where the noise information includes preset luminance information.

After extracting the N feature maps of the image to be processed, the first convolutional unit outputs the first feature map with the smallest scale (the feature map output by the last convolutional layer in the first convolutional unit) to the processing unit. The processing unit receives the first feature map and the input preset noise information, and fuses them to obtain the second feature map. The processing unit can fuse the first feature map and the preset noise information in a variety of ways, which are described below by taking one implementation mode as an example.

The fifth convolutional layer between the processing unit and the first convolutional unit receives the input noise information and the first feature map, and stacks the first feature map and the noise information to obtain a fifth feature map; each convolutional block in the processing unit acquires a preset third convolutional weight matrix; and each convolutional block performs a convolutional operation on the feature maps output by all previous convolutional blocks and the third convolutional weight matrix corresponding to that convolutional block to obtain the second feature map.

For example, suppose the first feature map is a 3×3 matrix, the noise information is also a 3×3 matrix, and the convolutional kernels in the fifth convolutional layers are 3×3 matrices. Referring to FIG. 12, after receiving the first feature map and the noise information, the fifth convolutional layer superimposes the two 3×3 matrices in dimension to obtain a 6×3 matrix, and then performs a convolutional operation on the 6×3 matrix with the 3×3 convolutional kernels in the fifth convolutional layer to obtain the second feature map.

Step 1003, the second convolutional unit fuses the received N feature maps with the second feature map to obtain a processed image.

Fusing, by the second convolutional unit, the received N feature maps with the second feature map to obtain the processed image includes: each second convolutional layer in the second convolutional unit acquires a preset second convolutional weight matrix; and each second convolutional layer performs a convolutional operation on the feature map output by the corresponding first convolutional layer and the feature map output by the previous adjacent second convolutional layer to obtain the processed image.

In the solution provided by the embodiments of the present disclosure, the convolutional operation, by the N second convolutional layers, on the feature map output by the corresponding first convolutional layer and the feature map output by the previous adjacent second convolutional layer to obtain the processed image also includes the following two cases:

Case 1: if any second convolutional layer is the first one of the second convolutional layers in the second convolutional unit, a convolutional operation is performed on a third feature map and the second convolutional weight matrix corresponding to that second convolutional layer to obtain a feature map, where the third feature map is obtained by a fifth convolutional layer stacking the received first feature map and the second feature map.

Case 2: if any second convolutional layer is not the first one of the second convolutional layers in the second convolutional unit, the feature map output by the previous adjacent second convolutional layer is stacked with the feature map output by the corresponding first convolutional layer to obtain a fourth feature map, and a convolutional operation is performed on the fourth feature map and the second convolutional weight matrix corresponding to that second convolutional layer to obtain a feature map.

Specifically, the process of performing the convolutional operation on the input feature map and the corresponding second convolutional weight matrix by any second convolutional layer is similar to the process of performing the convolutional operation on the input feature map and the corresponding first convolutional weight matrix by a first convolutional layer, and is not repeated here.

Further, in order to reduce the calculation amount in the convolutional operation process, the dimensions of each first convolutional weight matrix, each second convolutional weight matrix, and each third convolutional weight matrix are all 1×1.

In the solution provided by the embodiments of the present disclosure, the first convolutional unit extracts the N feature maps with different scales from the image to be processed; the processing unit fuses the received preset noise information with the first feature map with the smallest scale extracted by the first convolutional unit to obtain the second feature map, and sends the second feature map to the second convolutional unit; and the second convolutional unit fuses the received N feature maps with the second feature map to obtain the processed image. Therefore, in the solution provided by the embodiments of the present disclosure, the first convolutional unit only extracts the image feature information of the image to be processed, which reduces the workload of image processing and improves the efficiency of image processing; and the extracted image feature information and noise information are fused through the processing unit, so that the coverage of the image feature information is increased, which increases image contrast and improves image quality.

Further, in order to reduce the calculation amount in the image processing process and improve the image processing efficiency, in step 1001 of the embodiments of the present disclosure, after the convolutional operation on the image to be processed and the first convolutional weight matrix corresponding to any first convolutional layer is performed, or after the convolutional operation on the feature map output by the previous adjacent first convolutional layer and the first convolutional weight matrix corresponding to that first convolutional layer is performed, the method further includes:

a scrambling unit down-samples the obtained feature map.

In step 1003 of the present embodiment, after the convolutional operation on the third feature map and any second convolutional weight matrix to obtain a feature map is performed, or after the convolutional operation on the fourth feature map and the second convolutional weight matrix corresponding to that second convolutional layer to obtain a feature map is performed, the method further includes:

a merging unit up-samples the feature map.

Referring to FIG. 13, the embodiments of the present disclosure provide an electronic device, and the electronic device includes:

a memory 1301 for storing instructions executed by at least one processor; and

the processor 1302 for acquiring and executing the instructions stored in the memory to implement the above image processing method.

The embodiments of the present disclosure provide a computer-readable storage medium, and the computer-readable storage medium stores computer instructions that cause a computer to perform the above image processing method when executed on the computer.

Those skilled in the art will understand that the embodiments of thepresent disclosure may be provided as methods, systems, or computerprogram products. Therefore, the present disclosure may take the form ofa full hardware embodiment, a full software embodiment, or an embodimentcombining software and hardware aspects. Furthermore, the presentdisclosure may take the form of a computer program product implementedon one or more computer usable storage media (including but not limitedto a magnetic disk memory, an optical memory, etc.) having computerusable program codes embodied therein.

The present disclosure is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to the embodiments of the present disclosure. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, may be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions executed by the processor of the computer or the other programmable data processing apparatus produce a device for implementing the functions specified in one or more flows in the flowcharts and/or one or more blocks in the block diagrams.

These computer program instructions may also be stored in a computer-readable memory which can direct the computer or the other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device which implements the functions specified in one or more flows in the flowcharts and/or one or more blocks in the block diagrams.

These computer program instructions may also be loaded onto the computer or the other programmable data processing apparatus such that a series of operational steps are performed on the computer or the other programmable apparatus to produce a computer-implemented process, such that the instructions executed on the computer or the other programmable apparatus provide steps for implementing the functions specified in one or more flows in the flowcharts and/or one or more blocks in the block diagrams.

Obviously, those skilled in the art can make various changes and modifications to the present disclosure without departing from the spirit and scope of the present disclosure. Thus, if these modifications and variations of the present disclosure fall within the scope of the claims of the present disclosure and their equivalents, the present disclosure is also intended to encompass these modifications and variations.

What is claimed is:
1. A convolutional neural network processor, comprising a first convolutional unit, a processing unit and a second convolutional unit, wherein the first convolutional unit comprises N first convolutional layers connected in sequence, and is configured to extract N feature maps with different scales in an image to be processed, wherein N is a positive integer, each first convolutional layer is configured to extract one feature map; the processing unit is connected with the first convolutional unit and the second convolutional unit, and is configured to fuse at least one piece of preset noise information received and a first feature map with a smallest scale in the N feature maps with different scales extracted by the first convolutional unit to obtain a fused second feature map; and the second convolutional unit comprises N second convolutional layers connected in sequence, and is configured to fuse the N feature maps extracted by the first convolutional unit with the second feature map to obtain a processed image.
2. The convolutional neural network processor according to claim 1, further comprising: 2N sampling units; wherein first N sampling units are scrambling units, an output end of each first convolutional layer is provided with a scrambling unit, which is configured to down-sample a feature image output by each first convolutional layer, and output of each scrambling unit serves as input of next first convolutional layer; and last N sampling units are merging units, and an output end of each second convolutional layer is provided with a merging unit, which is configured to up-sample a feature image output by each second convolutional layer.
3. The convolutional neural network processor according to claim 1, further comprising N interlayer connections configured to directly input an output of each first convolutional layer into a corresponding second convolutional layer; wherein the first convolutional layers are in one-to-one correspondence to the second convolutional layers.
4. The convolutional neural network processor according to claim 3, wherein the processing unit comprises a plurality of convolutional blocks connected in sequence, and output of each of the convolutional blocks is input of all subsequent convolutional blocks; wherein each of the convolutional blocks comprises a third convolutional layer and a fourth convolutional layer.
5. The convolutional neural network processor according to claim 4, further comprising: N+1 fifth convolutional layers; wherein the fifth convolutional layers are disposed at input ends of the processing unit and each of the second convolutional layers for performing superposition processing on a plurality of input data.
6. The convolutional neural network processor according to claim 5, wherein each first convolutional layer, each second convolutional layer, each third convolutional layer, each fourth convolutional layer, and each fifth convolutional layer comprise a 1×1 convolutional kernel, respectively.
7. The convolutional neural network processor according to claim 1, wherein the noise information comprises first noise information and second noise information; wherein a difference between an average value of all elements in the first noise information and a maximum luminance grayscale is smaller than a first preset threshold, and a difference between an average value of all elements in the second noise information and a minimum luminance grayscale is smaller than a second preset threshold.
8. An image processing method, applied to the convolutional neural network processor according to claim 1, comprising: receiving, by the first convolutional unit, the input image to be processed, extracting the N feature maps with different scales in the image to be processed, sending the N feature maps to the second convolutional unit, and sending the first feature map to the processing unit, wherein N is the positive integer, the first feature map is the feature map with the smallest scale in the N feature maps with different scales; fusing, by the processing unit, the received preset noise information and the first feature map, to obtain the second feature map, and sending the second feature map to the second convolutional unit; and fusing, by the second convolutional unit, the received N feature maps with the second feature map to obtain the processed image.
9. The image processing method according to claim 8, wherein the extracting the N feature maps with different scales in the image to be processed comprises: acquiring, by each first convolutional layer in the first convolutional unit, a preset first convolutional weight matrix; and performing, by each first convolutional layer, convolutional operation on a feature map output by a previous adjacent convolutional layer and the first convolutional weight matrix corresponding to the each first convolutional layer, to obtain the N feature maps with the different scales.
10. The image processing method according to claim 9, wherein the convolutional operation, performed by each first convolutional layer, on the feature map output by the previous adjacent convolutional layer and the first convolutional weight matrix corresponding to the each first convolutional layer comprises: if any first convolutional layer is a first one of the convolutional layers in the first convolutional unit, performing, by the any first convolutional layer, convolutional operation on the image to be processed and a first convolutional weight matrix corresponding to the any first convolutional layer to obtain a feature map; or if any first convolutional layer is not the first one of the convolutional layers in the first convolutional unit, performing, by the any first convolutional layer, convolutional operation on the feature map output by the previous adjacent first convolutional layer and a first convolutional weight matrix corresponding to the any first convolutional layer to obtain a feature map.
11. The image processing method according to claim 10, wherein after the convolutional operation on the image to be processed and the first convolutional weight matrix corresponding to the any first convolutional layer is performed, or after the convolutional operation on the feature map output by the previous adjacent first convolutional layer and the first convolutional weight matrix corresponding to the any first convolutional layer is performed, the method further comprises: down-sampling, by a scrambling unit, the obtained feature map.
12. The image processing method according to claim 8, wherein the fusing, by the second convolutional unit, the received N feature maps with the second feature map to obtain the processed image comprises: acquiring, by each second convolutional layer in the second convolutional unit, a preset second convolutional weight matrix; and performing, by each second convolutional layer, convolutional operation on a feature map output by a corresponding first convolutional layer and a feature map output by a previous adjacent second convolutional layer to obtain the processed image.
13. The image processing method according to claim 12, wherein the convolutional operation, performed by each second convolutional layer, on the feature map output by the corresponding first convolutional layer and the feature map output by the previous adjacent second convolutional layer to obtain the processed image comprises: if any second convolutional layer is a first one of the convolutional layers in the second convolutional unit, performing convolutional operation on a third feature map and the second convolutional weight matrix corresponding to the any second convolutional layer to obtain a feature map, wherein the third feature map is obtained by stacking a received first feature map and the second feature map by a fifth convolutional layer; or if any second convolutional layer is not the first one of the convolutional layers in the second convolutional unit, stacking the feature map output by the previous adjacent second convolutional layer with a feature map output by a corresponding first convolutional layer to obtain a fourth feature map, and performing convolutional operation on the fourth feature map and the second convolutional weight matrix corresponding to the any second convolutional layer to obtain a feature map.
14. The image processing method according to claim 13, wherein after the convolutional operation on the third feature map and the any second convolutional weight matrix to obtain the feature map is performed, or after the convolutional operation on the fourth feature map and the second convolutional weight matrix corresponding to the any second convolutional layer to obtain the feature map is performed, the method further comprises: up-sampling, by a merging unit, the feature map.
15. The image processing method according to claim 14, wherein the fusing, by the processing unit, the received preset noise information and the first feature map, to obtain the second feature map comprises: receiving, by a fifth convolutional layer between the processing unit and the first convolutional unit, the input noise information and the first feature map, and stacking the first feature map and the noise information to obtain a fifth feature map; acquiring, by each convolutional block in the processing unit, a preset third convolutional weight matrix; and performing, by each convolutional block, convolutional operation on the feature maps output by all previous convolutional blocks and the third convolutional weight matrix corresponding to each convolutional block to obtain the second feature map.
16. The image processing method according to claim 15, wherein the dimensions of each first convolutional weight matrix, each second convolutional weight matrix, and each third convolutional weight matrix are all 1×1.
17. An electronic device, comprising: a memory for storing instructions executed by at least one processor; and the processor for acquiring and executing the instructions stored in the memory to implement the method according to claim 8.
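
By way of illustration only, and not as a limitation of the claims, the first and second noise information recited in claim 7 could be constructed as follows in PyTorch; the 8-bit grayscale range, the threshold value, and the noise-map shape are assumptions of the example:

```python
import torch

# Hypothetical construction of the two preset noise maps from claim 7,
# assuming 8-bit luminance (grayscales 0..255) and illustrative thresholds.
MAX_GRAY, MIN_GRAY = 255.0, 0.0
threshold = 4.0
shape = (1, 1, 8, 8)  # one noise channel at the smallest feature-map scale

# First noise information: elements scattered tightly around the maximum
# luminance grayscale, so |mean - MAX_GRAY| stays below the threshold.
first_noise = MAX_GRAY - torch.rand(shape) * 2.0

# Second noise information: elements near the minimum luminance grayscale,
# so |mean - MIN_GRAY| stays below the threshold.
second_noise = MIN_GRAY + torch.rand(shape) * 2.0

assert abs(first_noise.mean().item() - MAX_GRAY) < threshold
assert abs(second_noise.mean().item() - MIN_GRAY) < threshold
```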